Learning Dynamic Policies from Demonstration
Authors
Abstract
We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning method, such as regression or classification, that learns a reactive policy. Unfortunately, reactive policies cannot model long-range dependencies, and this omission can result in suboptimal performance. We therefore take a different approach. We observe that policies and dynamical systems are mathematical duals, and we use this fact to draw on the rich literature on system identification to learn dynamic policies with state directly from demonstration. Many system identification algorithms have desirable properties, such as the ability to model long-range dependencies, statistical consistency, and efficient off-the-shelf implementations. We show that by applying system identification algorithms to learning-from-demonstration problems, all of these properties carry over to the learning-from-demonstration domain. We further show that these properties can be beneficial in practice by applying state-of-the-art system identification algorithms to real-world direct learning-from-demonstration problems.
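The contrast between a reactive policy and a policy with state can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a hypothetical scalar expert whose action follows a_t = 0.8 a_{t-1} + o_t, and it uses a plain least-squares ARX fit as a stand-in for a system identification method. The function names `fit_reactive` and `fit_arx` are invented for illustration.

```python
import numpy as np

def fit_reactive(obs, act):
    """Least-squares fit of a memoryless policy a_t ~ w * o_t."""
    w, *_ = np.linalg.lstsq(obs[:, None], act, rcond=None)
    return w[0]

def fit_arx(obs, act, k):
    """Least-squares fit of an ARX model (a basic system identification tool):
    a_t ~ sum_{i=0..k} b_i * o_{t-i} + sum_{j=1..k} c_j * a_{t-j}."""
    rows = []
    for t in range(k, len(obs)):
        past_obs = obs[t - k:t + 1][::-1]   # o_t, o_{t-1}, ..., o_{t-k}
        past_act = act[t - k:t][::-1]       # a_{t-1}, ..., a_{t-k}
        rows.append(np.concatenate([past_obs, past_act]))
    X = np.array(rows)
    y = act[k:]
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta, X, y

# Hypothetical expert demonstration (an assumption, not data from the paper):
# the expert's action carries memory of all earlier observations.
rng = np.random.default_rng(0)
T = 500
obs = rng.standard_normal(T)
act = np.zeros(T)
act[0] = obs[0]
for t in range(1, T):
    act[t] = 0.8 * act[t - 1] + obs[t]

# Reactive policy: sees only the current observation.
w = fit_reactive(obs, act)
reactive_mse = float(np.mean((act - w * obs) ** 2))

# Dynamic policy: also sees one step of observation and action history.
theta, X, y = fit_arx(obs, act, k=1)
arx_mse = float(np.mean((y - X @ theta) ** 2))

print("reactive MSE:", reactive_mse)
print("ARX MSE:", arx_mse)
```

In this toy setting the reactive fit leaves a large residual because the expert's action depends on history, while the model with one step of memory reproduces the demonstration almost exactly. Real demonstrations are noisy and higher dimensional, so this only conveys the intuition.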
Related resources
Learning Movement Primitives
This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor co...
Multi-Step Learning to Search for Dynamic Environment Navigation
While navigation could be done using existing rule-based approaches, it becomes more attractive to use learning from demonstration (LfD) approaches to ease the burden of tedious rule designing and parameter tuning procedures. In our previous work, navigation in simple dynamic environments is achieved using the Learning to Search (LEARCH) algorithm with a proper feature set and the proposed data...
Learning Stable Task Sequences from Demonstration with Linear Parameter Varying Systems and Hidden Markov Models
The problem of acquiring multiple tasks from demonstration is typically divided in two sequential processes: (1) the segmentation or identification of different subgoals/subtasks and (2) a separate learning process that parameterizes a control policy for each subtask. As a result, segmentation criteria typically neglect the characteristics of control policies and rely instead on simplified mode...
Learning from Demonstration: Communication and Policy Generation
Learning from demonstration utilizes human expertise to program a robot. We believe this approach to robot programming will facilitate the development and deployment of general purpose personal robots that can adapt to specific user preferences. Demonstrations can potentially take place across a wide variety of environmental conditions. In this paper we address how learning from demonstration c...
Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot
Task demonstration is an effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work, we introduce an algorithm that uses corrective human feedback to build a policy able to perform a novel task, by combining simpler policies learned from demonstration. While some demonstration-based learning approach...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013